Semi-Mechanistic Gaussian Process Regression Model for Tumor Size

Jacqueline Buros

11 Sep 2024

Introduction

The models I will be discussing today were implemented by my colleague Juho Timonen.

My contribution to this project is mainly in motivating the use case and applying it in practice.

Motivation

Imagine you are analyzing data from an oncology clinical trial:

  • 30 patients randomized to an experimental drug
  • 30 patients randomized to standard therapy

We measure the size of patients’ tumors every 12 weeks, until they withdraw or stop therapy.

Key research question: Which treatment is more effective?

Tumor size data

For this analysis, we will focus on the tumor-size data.

Canonical Approach

The typical approach is to model these data using a non-linear function called the “Stein-Fojo” (SF) model (Stein et al. 2008):

\[ f^{\text{SF}}(t \mid k_g, k_s) = \exp(k_g t) + \exp(-k_s t) - 1, \]

where \(t\) is a measure of time and \(k_g, k_s\) are unknown growth and shrinkage parameters, respectively.
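A minimal base-R sketch of the SF function may help build intuition for the two rates. The time grid and the parameter values below are illustrative assumptions, not estimates from any trial.

```r
# Stein-Fojo (SF) curve: tumor size at time t relative to its size at t = 0.
# k_g is the growth rate and k_s the shrinkage rate, both per unit of time.
f_sf <- function(t, k_g, k_s) {
  exp(k_g * t) + exp(-k_s * t) - 1
}

t <- seq(0, 2, by = 0.25)                              # illustrative time grid
shrink_then_regrow <- f_sf(t, k_g = 0.3, k_s = 2.0)    # fast shrinkage, slow regrowth
steady_growth      <- f_sf(t, k_g = 0.5, k_s = 0.0)    # no shrinkage at all

# Every SF curve equals 1 at t = 0, so it describes size relative to baseline.
print(round(shrink_then_regrow, 3))
print(round(steady_growth, 3))
```

With \(k_s = 0\) the curve reduces to pure exponential growth, \(\exp(k_g t)\); with both rates positive it dips below 1 before regrowing.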

Canonical Approach

This model describes the example data shown earlier well. Here we fit it to 2 randomly selected subjects.
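As a rough sketch of what a per-subject SF fit can look like, the base-R code below fits the curve by least squares on the log scale using optim. This is not the fitting method used in the talk: the data are simulated for one hypothetical subject, and the baseline size y0, the noise level, and the starting values are all assumptions made for illustration.

```r
set.seed(42)

f_sf <- function(t, k_g, k_s) exp(k_g * t) + exp(-k_s * t) - 1

# Simulated measurements for one hypothetical subject (not trial data).
t <- seq(0, 2, by = 0.25)                       # time since baseline
y <- 50 * f_sf(t, k_g = 0.4, k_s = 1.5) * exp(rnorm(length(t), sd = 0.05))

# Sum of squared errors on the log scale. Parameters are optimized on the
# log scale so that the baseline size and both rates stay positive.
sse <- function(par) {
  mu <- par[1] + log(f_sf(t, exp(par[2]), exp(par[3])))
  sum((log(y) - mu)^2)
}

start <- c(log(y[1]), log(0.1), log(0.1))       # assumed starting values
fit   <- optim(start, sse)                      # Nelder-Mead by default

est <- c(y0 = exp(fit$par[1]), k_g = exp(fit$par[2]), k_s = exp(fit$par[3]))
print(round(est, 3))
```

A least-squares fit like this treats each subject in isolation; it does not share information across subjects the way a hierarchical model does.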

However …

Recent therapies elicit a greater variety of responses

Source: Borcoman et al. (2019)

Many therapies demonstrate heterogeneity of response

Source: Seidman et al. (2020)

SFGP package

SFGP

We released an open-source R package, sfgp, to address these shortcomings.

remotes::install_github('generable/sfgp')

This is largely the work of my colleague Juho Timonen.

Our approach

We start with the likelihood of observation \(i\) as

\[ \log(y_i + \delta) \sim \mathcal{N}(f_i, \sigma^2), \]

where

\[ f_i = f(\mathbf{x}_i)= \sum_{j=1}^J f^{(j)}(\mathbf{x}_i) \]

is the expected log tumor size, \(\sigma\) is the unknown residual standard deviation, and \(\delta\) is a constant.

The functions \(f^{(j)}\), \(j=1, \ldots, J\) are the additive function components called terms in the package documentation.
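The observation model can be sketched in a few lines of base R. The values of \(\delta\) and \(\sigma\), the baseline size, and the SF rates below are hypothetical and chosen only so the example runs; none of them come from the package.

```r
set.seed(1)

# Hypothetical values, chosen only for illustration.
delta <- 1       # constant added to y before taking logs
sigma <- 0.2     # residual SD on the log scale

# Two additive components evaluated at the observation times.
t      <- c(0, 0.25, 0.5, 0.75, 1)
f_base <- rep(log(40), length(t))                  # baseline log size
f_time <- log(exp(0.3 * t) + exp(-2 * t) - 1)      # log of an SF curve
f      <- f_base + f_time                          # f_i = sum_j f^(j)(x_i)

# Simulate observations: log(y_i + delta) ~ N(f_i, sigma^2).
y <- exp(rnorm(length(t), mean = f, sd = sigma)) - delta

# Log-likelihood of the simulated data, on the scale of log(y + delta).
log_lik <- sum(dnorm(log(y + delta), mean = f, sd = sigma, log = TRUE))
print(log_lik)
```

Because the components are summed on the log scale, they act multiplicatively on tumor size itself.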

Terms

By default, the baseline value is estimated per subject:

\[ f^{\text{BAS}} \left(\text{id} \mid \mathbf{c}_{0}\right) \]

The SF term sf(t, id) implements the SF function described previously:

\[ f^{\text{SF}}(t \mid k_g, k_s) = \exp(k_g t) + \exp(-k_s t) - 1 \]

In this example, both \(k_g\) and \(k_s\) have an estimated overall mean and a varying effect per id. The term can also be defined as sf(t, id | arm) to add a further level of hierarchy.

The gp(t) term implements a Hilbert space reduced-rank Gaussian process regression (HSGP) term (Solin and Särkkä 2020):

\[ f^{\text{HSGP}} \left(\text{t} \mid \mathbf{\xi}_{\text{t}}, \alpha_{\text{t}}, \ell_{\text{t}}, B_{\text{t}}, L_{\text{t}}\right) \]
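To make the HSGP term more concrete, here is a base-R sketch of the reduced-rank approximation from Solin and Särkkä (2020) for a one-dimensional input. It is an illustration, not the package's implementation. It assumes an exponentiated quadratic kernel, and it reads \(B\) as the number of basis functions and \(L\) as the half-width of the approximation domain, which is an interpretation of the notation above. All numeric values are illustrative.

```r
# Basis function b on the interval [-L, L].
phi <- function(x, b, L) sqrt(1 / L) * sin(pi * b * (x + L) / (2 * L))

# Spectral density of the 1-D exponentiated quadratic kernel.
spd_eq <- function(w, alpha, ell) {
  alpha^2 * sqrt(2 * pi) * ell * exp(-0.5 * ell^2 * w^2)
}

x     <- seq(-1, 1, length.out = 50)   # standardized time points
B     <- 30                            # number of basis functions
L     <- 2                             # domain half-width, wider than range(x)
alpha <- 1.0                           # marginal standard deviation
ell   <- 0.3                           # lengthscale

Phi         <- sapply(seq_len(B), function(b) phi(x, b, L))   # 50 x B matrix
sqrt_lambda <- pi * seq_len(B) / (2 * L)                      # sqrt of eigenvalues
s           <- spd_eq(sqrt_lambda, alpha, ell)                # basis weights

# One prior draw: f(x) = sum_b xi_b * sqrt(s_b) * phi_b(x), with xi_b ~ N(0, 1).
set.seed(7)
xi <- rnorm(B)
f  <- as.vector(Phi %*% (sqrt(s) * xi))

# Sanity check: the covariance implied by the basis vs. the exact kernel.
K_approx <- Phi %*% (s * t(Phi))
K_exact  <- alpha^2 * exp(-0.5 * outer(x, x, "-")^2 / ell^2)
print(max(abs(K_approx - K_exact)))
```

The practical appeal is that the function is linear in the \(B\) coefficients \(\xi\), so no covariance matrix over the observations has to be built or inverted.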

The gp term can be defined per category, as gp(t, arm):

\[ f^{\text{HSGP}} \left(\text{t}, \text{arm} \mid \mathbf{\xi}^{(1)}_{\text{t} \times \text{arm}}, \ldots, \mathbf{\xi}^{(G_{arm})}_{\text{t} \times \text{arm}}, \alpha_{\text{t} \times \text{arm}}, \ell_{\text{t} \times \text{arm}}, B_{\text{t} \times \text{arm}}, L_{\text{t} \times \text{arm}}\right) \]

where \(G_{\text{arm}}\) is the number of treatment arms and GP kernel hyper-parameters are shared between groups (here, treatment arms).

Models

SF Model

We define a basic SF model with \(k_s\) and \(k_g\) terms varying by id and arm:

TSModel$new(
  y ~ sf(t, id | arm)
)

\[\begin{align} f^{(1)}(\mathbf{x}) &= f^{\text{log-SF}} \left(\text{t} \mid \mathbf{k}_{g}, \mathbf{k}_{s}\right)\\ f^{(2)}(\mathbf{x}) &= f^{\text{BAS}} \left(\text{id} \mid \mathbf{c}_{0}\right) \end{align}\]

SF+GP Model

Next we define an SF+GP Model. This includes both an SF and a GP term in addition to the baseline term. It effectively treats the SF as the mean function for the GP:

TSModel$new(
  y ~ sf(t, id | arm) + gp(t, arm)
)

\[\begin{align} f^{(1)}(\mathbf{x}) &= f^{\text{log-SF}} \left(\text{t} \mid \mathbf{k}_{g}, \mathbf{k}_{s}\right)\\ f^{(2)}(\mathbf{x}) &= f^{\text{HSGP}} \left(\text{t}, \text{arm} \mid \mathbf{\xi}^{(1)}_{\text{t} \times \text{arm}}, \ldots, \mathbf{\xi}^{(G_{arm})}_{\text{t} \times \text{arm}}, \alpha_{\text{t} \times \text{arm}}, \ell_{\text{t} \times \text{arm}}, B_{\text{t} \times \text{arm}}, L_{\text{t} \times \text{arm}}\right)\\ f^{(3)}(\mathbf{x}) &= f^{\text{BAS}} \left(\text{id} \mid \mathbf{c}_{0}\right) \end{align}\]
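Written out for a single observation \(i\) from subject \(n\) in arm \(a\), the three terms combine as follows. This assumes that \(f^{\text{log-SF}}\) denotes the logarithm of the SF curve, which is an interpretation of the notation rather than something stated above; the subscripts \(n\) and \(a\) are introduced here for illustration only.

\[ \log(y_i + \delta) \sim \mathcal{N}\left( c_{0,n} + \log f^{\text{SF}}(t_i \mid k_{g,n}, k_{s,n}) + g_{a}(t_i),\ \sigma^2 \right), \qquad g_a \sim \mathcal{GP}\left(0, k_{\alpha, \ell}\right) \]

Under this reading, the sum of the deterministic SF part and the zero-mean process \(g_a\) is itself a Gaussian process whose mean function is the SF part, with \(g_a\) approximated by the HSGP term.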

Alt SF+GP Model

Finally, we define an alternative SF+GP Model to show how the sub-components of both \(k_s\) and \(k_g\) can be specified.

TSModel$new(
  y ~ sff(
    t | ks ~ offset(id_ks | arm_ks) + gp(t_ks, arm_ks),
    kg ~ offset(id_kg | arm_kg) + gp(t_kg, arm_kg)
  )
)

It has the terms

\[\begin{align} f^{(1)}(\mathbf{x}) &= f^{\text{log-SF}} \left(\text{t} \mid \mathbf{k}_{g}, \mathbf{k}_{s}\right)\\ f^{(2)}(\mathbf{x}) &= f^{\text{BAS}} \left(\text{id} \mid \mathbf{c}_{0}\right) \end{align}\]

However, each of \(\mathbf{k}_{g}\) and \(\mathbf{k}_{s}\) is now composed of sub-component terms: a subject-level offset, offset(id | arm), and an HSGP term, gp(t, arm).

Fitting

Data

  • We demonstrate these methods on a simulated dataset with 4 arms: A, B, C, D.
  • The data are simulated from a mechanistic ODE model designed to imitate real-world data.
  • We sample 30 subjects per arm to simulate a phase II trial.

Data

An important feature of these data is the determination of “clinical progression”.

Fits

However …

LOO

When scoring models using PSIS-LOO ELPD, we see a clear difference in model performance: the Alt SF+GP model scores best, followed by SF+GP and then SF.

Model        elpd_diff   se_diff
Alt SF+GP         0.00      0.00
SF+GP          -594.12     43.88
SF            -1549.42     42.49
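One way to read this table is to express each ELPD difference in units of its standard error. The base-R sketch below does that arithmetic on the values reported above; the ratio column is added here for illustration and is not part of the original output.

```r
loo_tab <- data.frame(
  model     = c("Alt SF+GP", "SF+GP", "SF"),
  elpd_diff = c(0.0000, -594.1177, -1549.4241),
  se_diff   = c(0.00000, 43.88187, 42.48850)
)

# Size of each difference relative to its standard error.
# The reference model (first row) has se_diff = 0, so its ratio is set to NA.
loo_tab$ratio <- ifelse(loo_tab$se_diff > 0,
                        loo_tab$elpd_diff / loo_tab$se_diff,
                        NA)
print(loo_tab)
```

Both differences are more than ten standard errors from zero, which is why the ranking of the three models is not in doubt here.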

Treatment Effects

Treatment effects can be simulated with the treatment_effects method.

Arm-level Effects

Arm-level Effects with censoring

Alt Model Detail

The Alt SF+GP model includes 3 terms for each of the two SF parameters, shrinkage (\(k_s\)) and growth (\(k_g\)):

  1. Constant term
  2. HSGP term: gp(t, arm)
  3. Subject offset: offset(id | arm)

References

Borcoman, E, Y Kanjanapan, S Champiat, S Kato, V Servois, R Kurzrock, S Goel, P Bedard, and C Le Tourneau. 2019. “Novel Patterns of Response Under Immunotherapy.” Ann. Oncol. 30 (3): 385–96.
Seidman, Andrew D, Julia Maues, Tiah Tomlin, Vishal Bhatnagar, and Julia A Beaver. 2020. “The Evolution of Clinical Trials in Metastatic Breast Cancer: Design Features and Endpoints That Matter.” Am. Soc. Clin. Oncol. Educ. Book 40 (40): 1–11.
Solin, Arno, and Simo Särkkä. 2020. “Hilbert Space Methods for Reduced-Rank Gaussian Process Regression.” Stat. Comput. 30 (2): 419–46.
Stein, Wilfred D, William Doug Figg, William Dahut, Aryeh D Stein, Moshe B Hoshen, Doug Price, Susan E Bates, and Tito Fojo. 2008. “Tumor Growth Rates Derived from Data for Patients in a Clinical Trial Correlate Strongly with Patient Survival: A Novel Strategy for Evaluation of Clinical Trial Data.” Oncologist 13 (10): 1046–54.